Product Code Database
Example Keywords: scarf -trousers $81-163
   » » Wiki: Alphabet (formal Languages)
Tag Wiki 'Alphabet (formal Languages)'.
Tag

Alphabet (formal languages)
 (

 C O N T E N T S 
Rank: 100%
Bluestar Bluestar Bluestar Bluestar Blackstar

In formal language theory, an alphabet, sometimes called a vocabulary (see Nonterminal Symbols), is a non-empty set of indivisible symbols/characters/,

(1991). 9780534923730, PWS-Kent. .
typically thought of as representing letters, characters, digits, , or even words.
(1994). 9780387942582, Springer. .
(2025). 9780073383095, . .
The definition is used in a diverse range of fields including logic, mathematics, computer science, and linguistics. An alphabet may have any ("size") and, depending on its purpose, may be (e.g., the alphabet of letters "a" through "z"), (e.g., \{v_1, v_2, \ldots\}), or even (e.g., \{v_x : x \in \mathbb{R} \}).

Strings, also known as "words" or "sentences", over an alphabet are defined as a of the symbols from the alphabet set.

(2025). 9781441912206, Springer. .
For example, the alphabet of lowercase letters "a" through "z" can be used to form English words like "iceberg" while the alphabet of both upper and lower case letters can also be used to form proper names like "Wikipedia". A common alphabet is {0,1}, the binary alphabet, and "00101111" is an example of a . Infinite sequences of symbols may be considered as well (see ).

It is often necessary for practical purposes to restrict the symbols in an alphabet so that they are unambiguous when interpreted. For instance, if the two-member alphabet is {00,0}, a string written on paper as "000" is ambiguous because it is unclear if it is a sequence of three "0" symbols, a "00" followed by a "0", or a "0" followed by a "00".


Notation
By definition, the alphabet of a formal language L over \Sigma is the set \Sigma, which can be any non-empty set of symbols from which every string in L is built. For example, the set \Sigma = \{\_,\mathrm{a}, \dots, \mathrm{z}, \mathrm{A}, \dots, \mathrm{Z}, 0, \mathrm{1}, \dots, \mathrm{9}\} can be the alphabet of the formal language L that means "all variable identifiers in the C programming language". Notice that it is not required to use every symbol in the alphabet of L for its strings.

Given an alphabet \Sigma, the set of all strings of length n over the alphabet \Sigma is indicated by \Sigma^n. The set \bigcup_{i \in \mathbb{N}} \Sigma^i of all finite strings (regardless of their length) is indicated by the operator as \Sigma^*, and is also called the Kleene closure of \Sigma. The notation \Sigma^\omega indicates the set of all infinite sequences over the alphabet \Sigma, and \Sigma^\infty indicates the set \Sigma^\ast \cup \Sigma^\omega of all finite or infinite sequences.

For example, using the binary alphabet {0,1}, the strings ε, 0, 1, 00, 01, 10, 11, 000, etc. are all in the Kleene closure of the alphabet (where ε represents the ).


Applications
Alphabets are important in the use of , and . In most cases, for defining instances of automata, such as deterministic finite automata (DFAs), it is required to specify an alphabet from which the input strings for the automaton are built. In these applications, an alphabet is usually required to be a , but is not otherwise restricted.

When using automata, regular expressions, or as part of string-processing , the alphabet may be assumed to be the of the text to be processed by these algorithms, or a subset of allowable characters from the character set.


See also
  • Combinatorics on words
  • Terminal and nonterminal symbols


Literature
  • John E. Hopcroft and Jeffrey D. Ullman, Introduction to Automata Theory, Languages, and Computation, Addison-Wesley Publishing, Reading Massachusetts, 1979. .

Page 1 of 1
1
Page 1 of 1
1

Account

Social:
Pages:  ..   .. 
Items:  .. 

Navigation

General: Atom Feed Atom Feed  .. 
Help:  ..   .. 
Category:  ..   .. 
Media:  ..   .. 
Posts:  ..   ..   .. 

Statistics

Page:  .. 
Summary:  .. 
1 Tags
10/10 Page Rank
5 Page Refs